Talking to the Scheduler
Jobscripts
As mentioned before, we will need a jobscript if we are to access the scheduler system. A jobscript can be provided in two ways:
- Setting the script parameter of your Dataset
- Defining your jobscript using a BaseComputer class
Let's start with the first option: setting a script manually.
Using a Direct Script
If you already know what should be in your jobscript, the “simplest” way to provide it is to set the script attribute of a Dataset to a string.
Hardcoding the script this way is inflexible, but it does work. Let's demonstrate with a basic jobscript:
[2]:
script = """
#SBATCH --ntasks-per-node=64
#SBATCH --cpus-per-task=4
#SBATCH --nodes=4
#SBATCH --queue=test
#SBATCH --account=myuser
#SBATCH --walltime=12:00:00
#SBATCH --exclusive
module load python
module load module/version"""
With the script created, let's create a Dataset and URL for testing.
Note
You will need to manually set submitter to the submission command used on your machine. It should be listed in the documentation for your machine.
[3]:
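from remotemanager import Dataset, URL

def f(inp):
    return inp

# submitter is the command used to submit jobscripts on your machine
url = URL(submitter="sbatch")
ds = Dataset(f, url=url, skip=False)
# attach the hardcoded jobscript string to the dataset
ds.script = script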
Warning! The script has changed, this will allow runners to be resubmitted!
[4]:
ds.append_run({"inp": True})
ds.run(dry_run=True)
appended run runner-0
Running Dataset
assessing run for runner dataset-9ebf1589-runner-0... running
Transferring 5 Files... Done
launch command: cd temp_runner_remote && bash dataset-9ebf1589-master.sh
[5]:
print(ds.runners[0].jobscript.content)
#!/bin/bash
#SBATCH --ntasks-per-node=64
#SBATCH --cpus-per-task=4
#SBATCH --nodes=4
#SBATCH --queue=test
#SBATCH --account=myuser
#SBATCH --walltime=12:00:00
#SBATCH --exclusive
module load python
module load module/version
source $sourcedir/dataset-9ebf1589-repo.sh
python dataset-9ebf1589-runner-0-run.py 2>> dataset-9ebf1589-runner-0-error.out
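The directives and module loads are exactly what we set at the dataset level, with the shebang line above them and the lines that source the repo and launch the runner below them. This works as expected, provided you want to submit all of your runners with the same script.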
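If the resources do need to vary, the string has to be regenerated each time. As a minimal sketch of what that involves, the helper below assembles the same script from Python values. The name build_script and its arguments are hypothetical and not part of remotemanager. The directive names are copied as-is from the example script above, so check your scheduler's documentation for the names it actually accepts.

```python
def build_script(directives, modules, flags=()):
    """Assemble a jobscript body as a single string (hypothetical helper)."""
    lines = []
    # key=value directives, e.g. "#SBATCH --nodes=4"
    for name, value in directives.items():
        lines.append(f"#SBATCH --{name}={value}")
    # bare flags without a value, e.g. "#SBATCH --exclusive"
    for flag in flags:
        lines.append(f"#SBATCH --{flag}")
    # environment setup
    for module in modules:
        lines.append(f"module load {module}")
    return "\n".join(lines)


# values taken from the example script in this tutorial
script = build_script(
    {
        "ntasks-per-node": 64,
        "cpus-per-task": 4,
        "nodes": 4,
        "queue": "test",
        "account": "myuser",
        "walltime": "12:00:00",
    },
    modules=["python", "module/version"],
    flags=["exclusive"],
)
print(script)
```

The resulting string could then be assigned to ds.script in the same way as the handwritten one. Every change still means rebuilding and reassigning the whole script, which is the inflexibility described above.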
Using a Computer
A manual jobscript can be fine for quick testing (or if it never changes). However, in most cases a more dynamic solution is required.
This is where the Computer comes in. The simplest way to describe these structures is as a “translation layer” between a common set of URL properties and whatever the scheduler is expecting.
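To make the “translation layer” idea concrete, here is a rough sketch in plain Python. It is not the BaseComputer API, which the next tutorials cover. The property names, the SLURM and PBS tables, and the render function are all assumptions made for illustration. The point is only that one set of common properties can be rendered into different scheduler directives.

```python
# Hypothetical lookup tables: common property name -> scheduler directive.
# These are illustrative only and are not part of remotemanager.
SLURM = {
    "nodes": "#SBATCH --nodes={}",
    "walltime": "#SBATCH --time={}",
    "queue": "#SBATCH --partition={}",
    "account": "#SBATCH --account={}",
}
PBS = {
    "nodes": "#PBS -l select={}",
    "walltime": "#PBS -l walltime={}",
    "queue": "#PBS -q {}",
    "account": "#PBS -A {}",
}


def render(templates, properties):
    """Translate common properties into directive lines for one scheduler."""
    return [templates[name].format(value) for name, value in properties.items()]


# one common description of the job...
properties = {"nodes": 4, "walltime": "12:00:00", "queue": "test", "account": "myuser"}

# ...rendered for two different schedulers
slurm_lines = render(SLURM, properties)
pbs_lines = render(PBS, properties)
print("\n".join(slurm_lines))
print("\n".join(pbs_lines))
```

With this separation, the job description stays the same and only the lookup table changes when moving between machines.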
The next tutorials will cover the usage of these more advanced objects.